Abstract

Future miniaturization and mobilization of computing devices re- 
quires energy parsimonious 'adiabatic' computation. This is contin- 
gent on logical reversibility of computation. An example is the idea of 
quantum computations which are reversible except for the irreversible 
observation steps. We propose to study quantitatively the exchange 
of computational resources like time and space for irreversibility in 
computations. Reversible simulations of irreversible computations are 
memory intensive. Such (polynomial time) simulations are analysed 
here in terms of 'reversible' pebble games. We show that Bennett's 
pebbling strategy uses least additional space for the greatest number 
of simulated steps. We derive a trade-off for storage space versus ir- 
reversible erasure. Next we consider reversible computation itself. An 
alternative proof is provided for the precise expression of the ultimate 
irreversibility cost of an otherwise reversible computation without re- 
strictions on time and space use. A time-irreversibility trade-off hierar- 
chy in the exponential time region is exhibited. Finally, extreme time- 
irreversibility trade-offs for reversible computations in the thoroughly 
unrealistic range of computable versus noncomputable time-bounds are 
given. 



* Parts of this paper were presented in preliminary form in Proc. IEEE Physics of 
Computation Workshop, Dallas (Texas), Oct. 4-6, 1992, pp. 42-46, and Proc. 11th IEEE 
Conference on Computational Complexity, Philadelphia (Pennsylvania), May 24-27, 1996. 

† Supported in part by NSERC operating grant OGP-046506, ITRC, and a CGAT
grant. Address: Computer Science Department, University of Waterloo,
Waterloo, Ontario, Canada N2L 3G1. Email: mli@math.uwaterloo.ca

‡ Partially supported by the European Union through NeuroCOLT ESPRIT Working
Group Nr. 8556, and by NWO through NFI Project ALADDIN under Contract number
NF 62-376 and NSERC under International Scientific Exchange Award ISE0125663.
Address: CWI, Kruislaan 413, 1098 SJ Amsterdam, The Netherlands. Email: paulv@cwi.nl






1 Introduction 



The ultimate limits of miniaturization of computing devices, and there- 
fore the speed of computation, are constrained by the increasing density 
of switching elements in the device. Linear speed up by shortening inter- 
connects on a two-dimensional device is attended by cubing the dissipated 
energy per area unit per second. Namely, we square the number of switching 
elements per area unit and linearly increase the number of switching events 
per switch per time unit. The attending energy dissipation on this scale 
in the long run cannot be compensated for by cooling. Reduction of the 
energy dissipation per elementary computation step therefore determines 
future advances in computing power. In view of the difficulty in improving 
low-weight small-size battery performance, low-energy computing is already 
at this time of writing a main determining factor in advanced mobilization 
of computing and communication. 

Since 1940 the dissipated energy per bit operation in a computing device
has with remarkable regularity decreased by roughly one order of magnitude
(tenfold) every five years, [Keyes, 1988], [Landauer, 1988]. Extrapolations of
current trends show that the energy dissipation per binary logic operation
needs to be reduced below $kT$ (thermal noise) within 20 years. Here $k$ is
Boltzmann's constant and $T$ the absolute temperature in kelvin,
so that $kT \approx 3 \times 10^{-21}$ Joule at room temperature. Even at $kT$ level,
a future device containing $10^{18}$ gates in a cubic centimeter operating at
a gigahertz dissipates about 3 million watts. For thermodynamic
reasons, cooling the operating temperature of such a computing device to
almost absolute zero (to get $kT$ down) must dissipate at least as much energy
in the cooling as it saves for the computing, [Merkle, 1993].
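The order of magnitude of these figures can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a room temperature of 300 K (the text does not fix a value) and assuming that every gate dissipates exactly $kT$ once per clock cycle; the function names are illustrative only.

```python
# Boltzmann's constant in joules per kelvin (exact SI value).
BOLTZMANN_K = 1.380649e-23


def thermal_energy(temperature_kelvin: float) -> float:
    """Return kT in joules for the given absolute temperature."""
    return BOLTZMANN_K * temperature_kelvin


def dissipated_power(gates: float, frequency_hz: float, energy_per_op: float) -> float:
    """Power in watts if every gate dissipates energy_per_op once per cycle."""
    return gates * frequency_hz * energy_per_op


# Assumption: "room temperature" is taken as 300 K.
room_temperature = 300.0
kT = thermal_energy(room_temperature)

# 10^18 gates switching at 1 GHz, each dissipating kT per operation.
power = dissipated_power(1e18, 1e9, kT)

print(f"kT = {kT:.2e} J, power = {power:.2e} W")
```

Under these assumptions $kT$ comes out at a few times $10^{-21}$ Joule and the device dissipates a few megawatts, in line with the estimate above.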

Considerations of thermodynamics of computing started in the early
fifties. J. von Neumann reputedly thought that a computer operating at
temperature $T$ must dissipate at least $kT \ln 2$ Joule per elementary bit
operation, [Burks, 1966]. But R. Landauer [Landauer, 1961] demonstrated
that it is only the 'logically irreversible' operations in a physical computer
that are required to dissipate energy by generating a corresponding amount
of entropy for each bit of information that gets irreversibly erased. As a
consequence, any arbitrarily large reversible computation can be performed
on an appropriate physical device using only one unit of physical energy in
principle.

Examples of logically reversible operations are 'copying' of records, and
'canceling' of one record with respect to an identical record provided it is
known that they are identical. They are physically realizable (or almost
realizable) without energy dissipation. Such operations occur when a program
sets y := x and later (reversibly) erases x := 0 while retaining the same value
in y. We shall call such reversible erasure 'canceling' x against y. Irrespective
of the original contents of variable x we can always restore x by x := y.
However, if the program has no copy of the value in variable x which can
be identified by examining the program without knowing the contents of
the variables, then after (irreversibly) erasing x := 0 we cannot restore the
original contents of x even though some variable z may have by chance the
same contents. 'Copying' and 'canceling' are logically reversible, and their
energy dissipation free execution gives substance to the idea that logically
reversible computations can be performed with zero energy dissipation.

Generally, an operation is logically reversible if its inputs can always be
deduced from the outputs. Erasure of information in a way such that it cannot
be retrieved is not reversible. Erasing a bit irreversibly necessarily dissipates
$kT \ln 2$ energy in a computer operating at temperature $T$. In contrast,
computing in a logically reversible way says nothing about whether or not
the computation dissipates energy. It merely means that the laws of physics
do not require such a computer to dissipate energy. Logically reversible
computers built from reversible circuits, [Fredkin & Toffoli, 1982], or the
reversible Turing machine, [Bennett, 1982], implemented with current technology
will presumably dissipate energy but may conceivably be implemented
by future technology in an adiabatic fashion. Current conventional electronic
technologies for implementing 'adiabatic' logically reversible computation
are discussed in [Merkle, 1993], [Proc. PhysComp, 1981, 1992, 1994].
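The definition of logical reversibility (inputs can always be deduced from outputs) amounts to injectivity of the operation, which can be tested exhaustively for small gates. The sketch below is illustrative only: the helper `is_logically_reversible` is a hypothetical name, and modelling 'copying' and 'canceling' as a single XOR update is an assumption made here for the sake of the example, not a construction taken from the text.

```python
from itertools import product


def is_logically_reversible(gate, arity):
    """True if distinct input tuples always give distinct output tuples."""
    seen = set()
    for bits in product((0, 1), repeat=arity):
        out = gate(*bits)
        if out in seen:          # two inputs collide: the input cannot be deduced
            return False
        seen.add(out)
    return True


def and_gate(a, b):
    # Ordinary AND: three of the four inputs map to output 0.
    return (a & b,)


def toffoli_gate(a, b, c):
    # Controlled-controlled-NOT: flips c exactly when a and b are both 1.
    return (a, b, c ^ (a & b))


def copy_or_cancel(x, y):
    # Assumed model: y := y XOR x copies x onto a blank y (y == 0)
    # and cancels y against an identical x (y == x).
    return (x, y ^ x)


print(is_logically_reversible(and_gate, 2))        # AND loses information
print(is_logically_reversible(toffoli_gate, 3))    # Toffoli gate is a bijection
print(is_logically_reversible(copy_or_cancel, 2))  # copy/cancel is a bijection
```

Applying `copy_or_cancel` twice returns the original pair, which mirrors the remark that a canceled record can always be restored from its remaining copy.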



An example of a hypothetical reversible computer that is both logically
and physically perfectly reversible and perfectly free from energy dissipation
is the billiard ball computer, [Fredkin & Toffoli, 1982]. Another example
is the exciting prospect of quantum computation, [Feynman, 1985],
[Deutsch, 1985], [Shor, 1994], which is reversible except for the irreversible
observation steps.



1.1 Outline of the Paper 



Here we propose the quantitative study of exchanges of computing resources 
such as time and space for irreversibility which we believe will be relevant 
for the physics of future computation devices. 

Reversible simulation. Bennett [Bennett, 1989] gives a general
reversible simulation for irreversible algorithms in the stylized form of a pebble
game. While such reversible simulations incur little overhead in additional
computation time, they may use a large amount of additional memory space
during the computation. We show that among all simulations which can be
modelled by the pebble game, Bennett's simulation is optimal in that it uses
the least auxiliary space for the greatest number of simulated steps. That
is, if $S$ is the space used by the simulated irreversible computation, then the
simulator uses $nS$ space to simulate $(2^n - 1)S$ steps of the simulated
computation. Moreover, we show that no simple generalization of such simulations
can simulate that many steps using $(n - 1)S$ space. On the other hand, we
show that at the cost of a limited amount of erasure the simulation can be
made more space efficient: we can save $kS$ space in the reversible simulation
at a cost of $(2^{k+2} - 1)S$ irreversible bit erasures, for all $k$ with $1 \le k \le n$.
Hence there can be an advantage in adding limited irreversibility to an
otherwise reversible simulation of conventional irreversible computations. This
may be of some practical relevance for adiabatic computing.

Reversible computation. Next, we consider irreversibility issues
related to reversible computations themselves. Such computations may be
directly programmed on a reversible computer or may be a reversible
simulation of an irreversible computation. References [Lecerf, 1963], [Bennett, 1973]
show independently that all computations can be performed logically
reversibly at the cost of eventually filling up the memory with unwanted
garbage information. This means that reversible computers with bounded
memories require in the long run irreversible bit operations, for example, to
erase records irreversibly to create free memory space. The minimal possible
number of irreversibly erased bits to do so determines the ultimate limit of
heat dissipation of the computation by Landauer's principle.

To establish the yardstick for subsequent trade-offs, we give an alterna- 
tive direct operational proof for the known exact expression of the ultimate 
number of irreversible bit operations in an otherwise reversible computa- 
tion, without any bounds on computational resources like time and space, 
Theorem |.Q 

Time-Irreversibility trade-offs. Clearly, to potentially reduce phys- 
ical energy dissipation one first needs to reduce the number of irreversible 
bit erasures in an otherwise reversible computation. This can be achieved 
by using more computation steps to drive the number of irreversible
computation steps closer to ultimate limits. The method typically reversibly
compresses 'garbage' information before irreversibly erasing it. (A similar
situation holds for space bounds on memory use.)

1 This is the unpublished proof in [Li and Vitanyi, 1992]; compare with the proof in
[Bennett et al., 1993].

Time-Irreversibility hierarchy. For exponential time bounds diag- 
onalization techniques are used to establish the existence of a sequence of 
increasing time bounds for a computation resulting in a sequence of de- 
creasing irreversibility costs. (These time bounds are exponential func- 
tions, while practical adiabatic computation usually deals with less-than- 
exponential time in the size of the input.) 

Extreme trade-offs. In the thoroughly unrealistic realm of computable 
versus noncomputable time-bounds it turns out that there exist most ex- 
treme time-irreversibility trade-offs. 



1.2 Previous Work 

Currently, we are used to designing computational procedures containing
irreversible operations. To perform the intended computations without energy
dissipation the related computation procedures need to become completely
reversible. Fortunately, all irreversible computations can be simulated in a
reversible manner, [Lecerf, 1963], [Bennett, 1973]. All known reversible
simulations of irreversible computations use little overhead in time but large
amounts of additional space. Commonly, polynomial time computations are
considered as the practically relevant ones. Reversible simulation will not
change such a time bound significantly, but requires considerable additional
memory space. In this type of simulation one needs to save on space; time
is already almost optimal.



The reversible simulation in [Bennett, 1973] of $T$ steps of an irreversible
computation from $x$ to $f(x)$ reversibly computes from input $x$ to output
$(x, f(x))$ in $T' = O(T)$ time. However, since this reversible simulation at
some time instant has to record the entire history of the irreversible
computation, its space use increases linearly with the number of simulated steps $T$.
That is, if the simulated irreversible computation uses $S$ space, then for some
constant $c > 1$ the simulation uses $T' = c + cT$ time and $S' = c + c(S + T)$
space. The question arises whether one can reduce the amount of auxiliary
space needed by the simulation by a more clever simulation method or by
allowing limited amounts of irreversibility.
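The history-recording idea can be illustrated by a small compute, copy, uncompute loop. This is only a sketch under stated assumptions, not the Turing machine construction of [Bennett, 1973]: the names `simulate_with_history`, `halve` and `unhalve` are hypothetical, the irreversible step is integer halving, and the "history" is a Python list holding the bit each step would otherwise destroy.

```python
def simulate_with_history(step, undo, x, steps):
    """Compute forward while logging a history, copy the result, then uncompute.

    step(state) -> (new_state, record): record holds what the step overwrote.
    undo(new_state, record) -> state: exact inverse of step given the record.
    Returns ((x, output), peak_history_length).
    """
    state, history = x, []
    for _ in range(steps):
        state, record = step(state)
        history.append(record)        # history grows linearly with the step count
    peak_history = len(history)

    output = state                    # reversible copy onto a blank register

    while history:                    # retrace the computation, consuming history
        state = undo(state, history.pop())

    assert state == x                 # the input is restored, no garbage remains
    return (state, output), peak_history


def halve(state):
    # Irreversible on its own: the low-order bit is discarded.
    return state // 2, state % 2


def unhalve(state, record):
    # With the recorded bit the step can be undone exactly.
    return 2 * state + record


result, peak = simulate_with_history(halve, unhalve, 45, 3)
print(result, peak)
```

The forward pass, the copy and the backward pass together take time proportional to the number of simulated steps, while the history reaches a length equal to that number before it is consumed again; this is the space cost the question above is concerned with.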



In [Bennett, 1989] another elegant simulation technique is devised
reducing the auxiliary storage space. This simulation does not save the entire
history of the irreversible computation but it breaks up the simulated
computation into segments of about $S$ steps and saves in a hierarchical manner
checkpoints consisting of complete instantaneous descriptions of the
simulated machine (entire tape contents, tape heads positions, state of the finite
control). After a later checkpoint is reached and saved, the simulating
machine reversibly undoes its intermediate computation reversibly erasing the
intermediate history and reversibly canceling the previously saved
checkpoint. Subsequently, the computation is resumed from the new checkpoint
onwards.

The reversible computation simulates $k^n$ segments of length $m$ of
irreversible computation in $(2k - 1)^n$ segments of length $O(m + S)$ of reversible
computation using $n(k - 1) + 1$ checkpoint registers using $O(m + S)$ space
each, for each $k, n, m$.

This way it is established that there are various trade-offs possible in
time-space in between $T' = \Theta(T)$ and $S' = \Theta(TS)$ at one extreme
($k = 1$, $m = T$, $n = 1$) and (with the corrections of [Levine and Sherman, 1990])
$T' = \Theta(T^{1+\epsilon}/S^{\epsilon})$ and $S' = \Theta(c(\epsilon) S (1 + \log T/S))$
with $c(\epsilon) = \epsilon 2^{1/\epsilon}$ for each
$\epsilon > 0$, using always the same simulation method but with different parameters
$k, n$ where $1 + \epsilon = \log_k (2k - 1)$ and $m = \Theta(S)$. Typically, for $k = 2$ we have
$1 + \epsilon = \log 3$. Since for $T > 2^S$ the machine goes into a computational loop,
we always have $\log T \le S$. Therefore, it follows from Bennett's simulation
that each irreversible Turing machine using space $S$ can be simulated by a
reversible machine using space $S^2$ in polynomial time.
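The segment counts quoted above are easy to tabulate. The helper below is a hypothetical illustration: it evaluates only the expressions $k^n$, $(2k - 1)^n$ and $n(k - 1) + 1$ stated in the text, plus the exponent $\log_k (2k - 1)$ that relates the number of reversible segments to the number of simulated ones. The name `time_exponent` is introduced here and is not the paper's notation.

```python
import math


def bennett_parameters(k: int, n: int):
    """Segment and register counts for the hierarchical checkpoint simulation.

    Returns (simulated_segments, reversible_segments,
             checkpoint_registers, time_exponent).
    """
    simulated_segments = k ** n             # segments of irreversible computation
    reversible_segments = (2 * k - 1) ** n  # segments of reversible computation
    checkpoint_registers = n * (k - 1) + 1
    # (2k - 1)^n == (k^n)^time_exponent; for k == 1 both counts equal 1.
    time_exponent = math.log(2 * k - 1, k) if k > 1 else 1.0
    return (simulated_segments, reversible_segments,
            checkpoint_registers, time_exponent)


for k in (2, 4, 16):
    sim, rev, regs, exponent = bennett_parameters(k, 3)
    print(f"k={k}: {sim} segments in {rev}, {regs} registers, exponent {exponent:.3f}")
```

Larger $k$ brings the exponent closer to 1 at the price of more checkpoint registers, which is the time-space trade-off described in the paragraph.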



2 Reversible Simulation 



Analysing the simulation method of [Bennett, 1989] shows that it is
essentially no better than the simple [Bennett, 1973] simulation in terms of time
versus irreversible erasure trade-off. Extra irreversible erasing can reduce
the simulation time of the former method to $O(T)$, but the 'simple' method
has $O(T)$ simulation time without irreversible erasures anyway, at the
cost of a large space consumption. Therefore, it is crucial to decrease the
extra space required for the pure reversible simulation without increasing
time if possible, and in any case further reduce the extra space at the cost
of limited numbers of irreversible erasures.

Since there is no better general reversible simulation of an irreversible
computation known than the one above, and it seems likely that each proposed
method must have similar history preserving features, analysis of this
particular style of simulation may in fact give results with more general validity.
We establish lower bounds on space use and upper bounds on space versus
irreversible erasure trade-offs.

To analyse such trade-offs we use Bennett's brief suggestion in [Bennett, 1989]
that a reversible simulation can be modelled by the following 'reversible'
pebble game. Let $G$ be a linear list of nodes $\{1, 2, \ldots, T_G\}$. We define a
pebble game on $G$ as follows. The game proceeds in a discrete sequence of
steps of a single player. There are $n$ pebbles which can be put on nodes of
$G$. At any time the set of pebbles is divided in pebbles on nodes of $G$ and
the remaining pebbles which are called free pebbles. At each step either an
existing free pebble can be put on a node of $G$ (and is thus removed from
the free pebble pool) or be removed from a node of $G$ (and is added to the
free pebble pool). The rules of the game are as follows.

1. Initially $G$ is unpebbled and there is a pool of free pebbles.

2. In each step the player can either

(a) put a free pebble on node 1 or remove a pebble from node 1, or

(b) for some node $i > 1$, put a free pebble on node $i$ or remove a pebble
from node $i$, provided node $i - 1$ is pebbled at the time.

3. The player wins the game if he pebbles node $T_G$ and subsequently
removes all pebbles from $G$.
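The three rules above can be captured by a small referee that rejects illegal moves. The class below is a minimal sketch written for illustration; the name `PebbleGame` and its methods are hypothetical and not taken from [Bennett, 1989].

```python
class PebbleGame:
    """Minimal referee for the reversible pebble game on nodes 1..size."""

    def __init__(self, size, pebbles):
        self.size = size
        self.free = pebbles          # pebbles in the free pool (rule 1)
        self.pebbled = set()         # nodes currently carrying a pebble
        self.max_in_use = 0          # peak number of pebbles on G
        self.reached_last = False    # has the last node ever been pebbled?

    def _accessible(self, node):
        # Rule 2: node 1 is always accessible; node i > 1 needs node i-1 pebbled.
        if not 1 <= node <= self.size:
            return False
        return node == 1 or (node - 1) in self.pebbled

    def put(self, node):
        if not self._accessible(node) or node in self.pebbled or self.free == 0:
            raise ValueError(f"illegal put on node {node}")
        self.pebbled.add(node)
        self.free -= 1
        self.max_in_use = max(self.max_in_use, len(self.pebbled))
        if node == self.size:
            self.reached_last = True

    def remove(self, node):
        if not self._accessible(node) or node not in self.pebbled:
            raise ValueError(f"illegal remove on node {node}")
        self.pebbled.remove(node)
        self.free += 1

    def won(self):
        # Rule 3: the last node was pebbled and the board is empty again.
        return self.reached_last and not self.pebbled


# A winning play on three nodes with two pebbles.
game = PebbleGame(size=3, pebbles=2)
for action, node in [("put", 1), ("put", 2), ("remove", 1), ("put", 3),
                     ("remove", 3), ("put", 1), ("remove", 2), ("remove", 1)]:
    getattr(game, action)(node)
print(game.won(), game.max_in_use)
```

Note that putting and removing a pebble on a node have the same precondition, so every legal move can be undone by a legal move; this is what makes the game 'reversible'.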



The maximum number n of pebbles which are simultaneously on G at 
any time in the game gives the space complexity nS of the simulation. If one 
deletes a pebble not following the above rules, then this means a block of 
bits of size S is erased irreversibly. The limitation to Bennett's simulation 
is in fact space, rather than time. When space is limited, we may not 
have enough place to store garbage, and these garbage bits will have to be 
irreversibly erased. We establish a tight lower bound for any strategy for 
the pebble game in order to obtain a space-irreversibility trade-off. 

Lemma 1 There is no winning strategy with $n$ pebbles for $T_G \ge 2^n$.

Proof. Fix any pebbling strategy for the player. To prove the lemma
it suffices to show that the player cannot reach node $f(k) = 2^k$ using $k$
pebbles, and also remove all the pebbles at the end, for $k := 1, 2, \ldots$. We
proceed by induction.

Basis: $k = 1$. It is straightforward to establish that $f(1) = 2$ cannot be
reached with 1 pebble.

Induction: $k \to k + 1$. Assume that it has been established that $f(i) = 2^i$
cannot be reached with $i$ pebbles, for $i = 1, \ldots, k$. Consider pebbling $G$ using
$k + 1$ pebbles. Assume that the player can pebble node $f(k) + 1 = 2^k + 1$
(otherwise the induction is finished).

Then, by the rules of the game, there must be a least step $t$ such that for
all times $t' > t$ there are pebbles on some nodes in $f(k) + 1, f(k) + 2, \ldots, T_G$.
Among other things, this implies that at step $t + 1$ node $f(k) + 1$ is pebbled.

Partition the first $f(k) - 2$ nodes of $G$ into disjoint consecutive regions:
starting with node 1, region $L_i$ consists of the next block of $f(k - i)$ nodes,
for $i = 1, \ldots, k - 1$. That is,
$L_i = \{\sum_{j=1}^{i-1} 2^{k-j} + 1, \ldots, \sum_{j=1}^{i} 2^{k-j}\}$. The
regions $L_1, \ldots, L_{k-1}$ cover nodes $1, \ldots, f(k) - 2$. Denote the remainder of $G$
but for nodes $f(k) - 1, f(k)$ by $R$, that is
$R = G - \{f(k) - 1, f(k)\} - \bigcup_{i=1}^{k-1} L_i = \{f(k) + 1, f(k) + 2, \ldots, T_G\}$.

Consider the game from step $t + 1$ onwards. If there is always at least one
pebble on nodes $1, \ldots, f(k)$, then by inductive assumption the player can
pebble with one initial pebble on $f(k) + 1$ and the remaining $k - 1$ free pebbles
at most $f(k) - 1$ nodes and hence no further than node $2f(k) - 1 = 2^{k+1} - 1$,
and the induction is finished.

Therefore, to possibly pebble node $2^{k+1}$ the player needs to remove all
pebbles from nodes $1, \ldots, f(k)$ first. Because node $f(k) + 1$ was pebbled at
step $t + 1$ we know that node $f(k)$ did have a pebble at that time according
to the game rules. By assumption, from time $t + 1$ there will henceforth
always be a leading pebble in region $R$. Moreover, at time $t + 1$ there is a
pebble on node $f(k)$. To remove all the pebbles in range $1, \ldots, f(k)$, the
following requirements have to be satisfied.

• From time $t + 1$ onwards, there must always be a pebble at a strategic
location in $L_1$ until the last remaining pebble in $G - (L_1 \cup R) =
\{f(k - 1) + 1, \ldots, f(k)\}$ is removed. Otherwise with at most $k - 1$
pebbles, the player cannot cross the unpebbled region $L_1$ (because
$|L_1| = f(k - 1)$) to reach and remove the last remaining pebble
in the range $G - (L_1 \cup R)$. There are only $k - 1$ pebbles available
because from time $t + 1$ on we have a pebble in region $R$, and at least
one pebble in $H = G - (L_1 \cup R)$.

• From time $t + 1$ onwards, there must always be a pebble at a strategic
location in $L_2$ until the last remaining pebble in $G - (L_1 \cup L_2 \cup R) =
\{f(k - 1) + f(k - 2) + 1, \ldots, f(k)\}$ is removed. Otherwise, with at most
$k - 2$ pebbles, the player cannot cross the unpebbled region $L_2$ (because
$|L_2| = f(k - 2)$) to reach and remove the last remaining pebble
in the range $G - (L_1 \cup L_2 \cup R)$. There are only $k - 2$ pebbles available
because from time $t + 1$ on we have a pebble in region $R$, a pebble in
$L_1$ (to help removing the last remaining pebble in $L_2$), and at least
one pebble in $H = G - (L_1 \cup L_2 \cup R)$.

• By iteration of the argument, there must be a pebble in each region
$L_i$ at time $t + 1$, for $i = 1, \ldots, k - 1$.

But these requirements use up $k - 1$ pebbles located in regions $L_1, \ldots, L_{k-1}$.
None of these regions can become pebble-free before we free the pebble on
node $f(k)$, that is, the $k$th pebble. The $(k+1)$st pebble is in region $R$ forever
after step $t + 1$. Therefore, there is no pebble left to pebble node $f(k) - 1$
which is not in $R \cup \{f(k)\} \cup \bigcup_{i=1}^{k-1} L_i$. Hence it is impossible to remove all $k$
pebbles from the first nodes $1, \ldots, f(k)$. Thus, leaving one pebble in region
$\{1, \ldots, f(k)\}$ with at most $k$ remaining pebbles, by inductive assumption,
the player can pebble no farther than node $2f(k) - 1$, which finishes the
induction. □
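For small numbers of pebbles the bound of Lemma 1 can be checked by exhaustive search over board configurations. The sketch below is an illustration, not part of the proof; `farthest_node` is a hypothetical helper that explores all configurations reachable from the empty board. Because putting and removing a pebble on a node share the same precondition, every reachable configuration can be unwound to the empty board, so reaching a node is equivalent to winning the game of that length.

```python
from collections import deque


def farthest_node(pebbles, size):
    """Largest node that can be pebbled on nodes 1..size with `pebbles` pebbles."""
    start = 0                       # bitmask: bit (i - 1) set means node i is pebbled
    seen = {start}
    queue = deque([start])
    best = 0
    while queue:
        config = queue.popleft()
        for node in range(1, size + 1):
            bit = 1 << (node - 1)
            # Node i > 1 may be toggled only while node i - 1 carries a pebble.
            if node > 1 and not (config & (bit >> 1)):
                continue
            nxt = config ^ bit      # put or remove a pebble on this node
            if bin(nxt).count("1") > pebbles or nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)
            if nxt & bit:           # the move was a placement
                best = max(best, node)
    return best


for n in range(1, 5):
    # The board is made longer than 2^n so the search is not cut off artificially.
    print(n, farthest_node(n, 2 ** n + 2))
```

The search is exponential in the board size and is meant only as a sanity check for very small $n$.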



Lemma 2 There is a winning strategy with $n$ pebbles for $T_G = 2^n - 1$.

Proof. Bennett's simulation [Bennett, 1989] is a winning strategy.



We describe his game strategy as the pebble game $G = \{1, \ldots, T_G\}$,
recursively. Let $I_k = I_{k-1} i_{k-1} I_{k-2} i_{k-2} \cdots I_1 i_1 I_0 i_0$ where $I_j$ is a sequence of
$2^j - 1$ consecutive locations in $G$, and $i_j$ is the node directly following $I_j$,
for $j = 0, 1, \ldots, k - 1$. Note that $|I_0| = 0$.

Let $F(k, I_k)$ be the program to pebble an initially pebble-free interval $I_k$
of length $2^k - 1$ of $G$, starting with $k$ free pebbles and a pebble-free $I_k$ and
ending with $k$ pebbles on $I_k$ including one pebble on the last node of $I_k$.

Let $F^{-1}(k, I_k)$ be the program starting with the end configuration of
$F(k, I_k)$ and executing the operation sequence of $F(k, I_k)$ in reverse, each
operation replaced by its inverse which undoes what the original operation
did, ending with $F(k, I_k)$'s initial configuration. We give the precise
procedure in self-explanatory pseudo PASCAL.



Procedure F(k, I_k):
for i := 1, 2, ..., k:
    F(k - i, I_{k-i});
    put pebble on node i_{k-i};
    F^{-1}(k - i, I_{k-i})

Procedure F^{-1}(k, I_k):
for i := k, k - 1, ..., 1:
    F(k - i, I_{k-i});
    remove pebble on node i_{k-i};
    F^{-1}(k - i, I_{k-i})



Note that this way both $F(0, I_0)$ and $F^{-1}(0, I_0)$ are 'skip' operations
which don't change anything. The size $T_G$ of a pebble game which is won
using this strategy using $n$ pebbles is $|I_n| = 2^n - 1$. Moreover, if $F(k, I_k)$
takes $t(k)$ steps we find $t(k) = 2t(k - 1) + \cdots + 2t(1) + k$, since each of the
$k$ loop iterations places one pebble and $t(0) = 0$. Then, $t(k) = 3t(k - 1) + 1$.
That is, the number of steps $T'_G$ of a winning play of a pebble
game of size $T_G = 2^n - 1$ is $T'_G \approx 3^n$, that is, $T'_G \approx T_G^{\log 3}$. □
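The recursive strategy can be replayed mechanically to confirm the three quantities in the proof: board length, peak number of pebbles and number of moves. The code below is a sketch under stated assumptions: `bennett_moves` and `play` are hypothetical names, nodes are numbered from 1, and the inverse procedure is generated as the same three-part loop body run over the sub-intervals in the opposite order with 'put' replaced by 'remove'.

```python
def bennett_moves(k, start=1, inverse=False):
    """Yield (action, node) moves of F(k, I_k), or of its inverse.

    I_k occupies the 2**k - 1 nodes beginning at `start` and is laid out as
    I_{k-1} i_{k-1} I_{k-2} i_{k-2} ... I_0 i_0.
    """
    layout = []
    pos = start
    for j in range(k - 1, -1, -1):
        layout.append((j, pos))      # I_j starts at pos; i_j is at pos + 2**j - 1
        pos += 2 ** j
    order = reversed(layout) if inverse else layout
    for j, sub_start in order:
        marker = sub_start + 2 ** j - 1
        yield from bennett_moves(j, sub_start)                 # F on I_j
        yield ("remove" if inverse else "put", marker)         # node i_j
        yield from bennett_moves(j, sub_start, inverse=True)   # inverse of F on I_j


def play(n):
    """Run F followed by its inverse; return (last_node_reached, peak, moves)."""
    pebbled, peak, moves, farthest = set(), 0, 0, 0
    sequence = list(bennett_moves(n)) + list(bennett_moves(n, inverse=True))
    for action, node in sequence:
        if node != 1 and (node - 1) not in pebbled:
            raise ValueError(f"predecessor of node {node} is not pebbled")
        if action == "put":
            if node in pebbled:
                raise ValueError(f"node {node} is already pebbled")
            pebbled.add(node)
            farthest = max(farthest, node)
        else:
            pebbled.remove(node)     # raises KeyError if the node is empty
        peak = max(peak, len(pebbled))
        moves += 1
    if pebbled:
        raise ValueError("board is not empty at the end")
    return farthest, peak, moves


for n in range(1, 6):
    print(n, play(n))
```

For each $n$ the play reaches node $2^n - 1$ with at most $n$ pebbles on the board, and the forward and backward passes together take $3^n - 1$ moves.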

The simulation given in [Bennett, 1989] follows the rules of the pebble
game of length $T_G = 2^n - 1$ with $n$ pebbles above. A winning strategy for
a game of length $T_G$ using $n$ pebbles corresponds with reversibly simulating
$T_G$ segments of $S$ steps of an irreversible computation using $S$ space such
that the reversible simulator uses $T' \approx S T'_G \approx S T_G^{\log 3}$ steps and total space
$S' = nS$. The space $S'$ corresponds to the maximal number of pebbles on
$G$ at any time during the game. The placement or removal of a pebble in
the game corresponds to the reversible copying or reversible cancelation of
a 'checkpoint' consisting of the entire instantaneous description of size $S$
(work tape contents, location of heads, state of finite control) of the
simulated irreversible machine. The total time $T_G S$ used by the irreversible
computation is broken up in segments of size $S$ so that the reversible copying
and canceling of a checkpoint takes about the same number of steps as
the computation segments in between checkpoints (see footnote 2).

2 In addition to the rules of the pebble game there is a permanently pebbled initial
node so that the simulation actually uses $n + 1$ pebbles for a pebble game with $n$ pebbles
of length $T_G + 1$. The simulation uses $n + 1 = S'/S$ pebbles for a simulated number of
$S(T_G + 1)$ steps of the irreversible computation.

We can now formulate a trade-off between space used by a polynomial
time reversible computation and irreversible erasures. First we show that
allowing a limited amount of erasure in an otherwise reversible computation
means that we can get by with less work space. Therefore, we define an
$m$-erasure pebble game as the pebble game above but with the additional
rule

• In at most $m$ steps the player can remove a pebble from any node
$i > 1$ without node $i - 1$ being pebbled at the time.

An m-erasure pebble game corresponds with an otherwise reversible com- 
putation using mS irreversible bit erasures, where S is the space used by 
the irreversible computation being simulated. 

Lemma 3 There is a winning strategy with $n$ pebbles and $2m - 1$ erasures
for pebble games $G$ with $T_G = m 2^{n-1}$, for all $m \ge 1$.

Proof. The strategy is to advance in blocks of size $2^{n-1} - 1$ using $n - 1$
pebbles without erasures (as in Lemma 2), put the $n$th pebble in front,
and invert the advancement process to free all the pebbles in the block.
The last remaining pebble has no predecessor and needs to be irreversibly
erased except in the initial block. The initial pebble is put in front of the
lastly placed $n$th pebble which, having done its duty as springboard for this
block, is subsequently irreversibly erased. Therefore, the advancement of
each block requires two erasures, except the first block which requires one,
yielding a total of $2m - 1$ erasures. Let $G = \{1, 2, \ldots, T_G\}$ be segmented
as $B_1 b_1 \ldots B_m b_m$, where each $B_i$ is a copy of interval $I_{n-1}$ above and $b_i$ is
the node following $B_i$, for $i = 1, \ldots, m$. Hence, $T_G = m 2^{n-1}$. We give the
precise procedure in self-explanatory pseudo PASCAL using the procedures
given in the proof of Lemma 2.

Procedure A(n, m, G):
for i := 1, 2, ..., m:
    F(n - 1, B_i);
    erase pebble on node b_{i-1};
    put pebble on node b_i;
    F^{-1}(n - 1, B_i) (removal of pebble from first node of B_i is an erasure)

The simulation time $T'_G$ is
$T'_G \approx 2m \cdot 3^{n-1} + 2 \approx 2m (T_G/m)^{\log 3} = 2 m^{1 - \log 3} T_G^{\log 3}$
for $T_G = m 2^{n-1}$. □
Theorem 1 (Space-Irreversibility Trade-off) (i) Pebble games $G$ of size
$2^n - 1$ can be won using $n$ pebbles but not using $n - 1$ pebbles.

(ii) If $G$ is a pebble game with a winning strategy using $n$ pebbles without
erasures, then there is also a winning strategy for $G$ using $E$ erasures and
$n - \log(E + 1)$ pebbles (for $E$ an odd integer at least 1).

Proof. (i) By Lemmas 1, 2.

(ii) By (i), $T_G = 2^n - 1$ is the maximum length of a pebble game $G$
for which there is a winning strategy using $n$ pebbles and no erasures. By
Lemma 3, we can pebble a game $G$ of length $T_G = m 2^{n - \log m} = 2^n$ using
$n + 1 - \log m$ pebbles and $2m - 1$ erasures. □

We analyse the consequences of Theorem 1. It is convenient to consider
the special sequence of values $E := 2^{k+2} - 1$ for $k := 0, 1, \ldots$. Let $G$ be
Bennett's pebble game of Lemma 2 of length $T_G = 2^n - 1$. It can be won
using $n$ pebbles without erasures, or using $n - k$ pebbles plus $2^{k+2} - 1$
erasures (which gives a gain over not erasing as in Lemma 2 only for $k \ge 1$),
but not using $n - 1$ pebbles.

Therefore, we can exchange space use for irreversible erasures. Such
a trade-off can be used to reduce the excessive space requirements of the
reversible simulation. The correspondence between the erasure pebble game
and the otherwise reversible computations using irreversible erasures is that
if the pebble game uses $n - k$ pebbles and $2^{k+2} - 1$ erasures, then the
otherwise reversible computation uses $(n - k)S$ space and erases $(2^{k+2} - 1)S$
bits irreversibly.

Therefore, a reversible simulation of an irreversible computation of length
$T = (2^n - 1)S$ can be done using $nS$ space using $(T/S)^{\log 3} S$ time, but is
impossible using $(n - 1)S$ space. It can also be performed using $(n - k)S$
space, $(2^{k+2} - 1)S$ irreversible bit erasures and
$2^{(k+1)(1 - \log 3) + 1} (T/S)^{\log 3} S$
time. In the extreme case we use no space to store the history and erase
about $4T$ bits. This corresponds to the fact that an irreversible computation
may overwrite its scanned symbol irreversibly at each step.
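To get a feeling for the numbers, the expressions in this paragraph can be tabulated for concrete $n$, $k$ and $S$. The helper below is a hypothetical illustration that simply evaluates the formulas as stated in the text (with $k = 0$ standing for the erasure-free simulation); it does not simulate anything, and the chosen values of $n$ and $S$ are arbitrary.

```python
import math

LOG3 = math.log2(3)   # log 3 to base 2, as used in the time bounds


def tradeoff(n, k, S):
    """Space, erased bits and time quoted in the text for T = (2**n - 1) * S."""
    T = (2 ** n - 1) * S
    base_time = (T / S) ** LOG3 * S
    if k == 0:
        # Erasure-free simulation: n*S space, no irreversible erasures.
        return {"space": n * S, "erased_bits": 0, "time": base_time}
    return {
        "space": (n - k) * S,
        "erased_bits": (2 ** (k + 2) - 1) * S,
        "time": 2 ** ((k + 1) * (1 - LOG3) + 1) * base_time,
    }


# Arbitrary example values: n = 10 checkpoints' worth of space, S = 100 bits.
for k in range(0, 4):
    row = tradeoff(10, k, 100)
    print(k, row["space"], row["erased_bits"], f"{row['time']:.3e}")
```

Each unit increase of $k$ saves $S$ bits of space while roughly doubling the number of irreversibly erased bits.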

Definition 1 Consider a simulation using $S'$ storage space and $T'$ time
which computes $y = (x, f(x))$ from $x$ in order to simulate an irreversible
computation using $S$ storage space and $T$ time which computes $f(x)$ from
$x$. The irreversible simulation cost $B^{S'}(x, y)$ of the simulation is the number
of irreversibly erased bits in the simulation (with the parameters $S, T, T'$
understood).

If the irreversible simulated computation from $x$ to $f(x)$ uses $T$ steps,
then for $S' = nS$ and $n = \log(T/S)$ we have above treated the most space
parsimonious simulation which yields $B^{S'}(x, y) = 0$, with $y = (x, f(x))$.

Corollary 1 (Space-Irreversibility Trade-off) Simulating a $T = (2^n - 1)S$
step irreversible computation from $x$ to $f(x)$ using $S$ space by a
computation from $x$ to $y = (x, f(x))$, the irreversible simulation cost satisfies:

(i) $B^{(n-k)S}(x, y) \le B^{nS}(x, y) + (2^{k+2} - 1)S$, for $n \ge k \ge 1$.

(ii) $B^{(n-1)S}(x, y) > B^{nS}(x, y)$, for $n \ge 1$.

For the most space parsimonious simulation with $n = \log(T/S)$ this
means that $B^{S(\log(T/S) - k)}(x, y) \le B^{S \log(T/S)}(x, y) + (2^{k+2} - 1)S$.

We conjecture that all reversible simulations of an irreversible
computation can essentially be represented as the pebble game defined above, and
that consequently the lower bound of Lemma 1 applies to all reversible
simulations of irreversible computations. If this conjecture is true then the
trade-offs above turn into a space-irreversibility hierarchy for polynomial
time computations.

3 Reversible Computation 

Given that a computation is reversible, either by being reversible a pri- 
ori or by being a reversible simulation of an irreversible computation, it 
will increasingly fill up the memory with unwanted garbage information. 
Eventually this garbage has to be irreversibly erased to create free memory 
space. As before, the number of irreversibly erased bits in an otherwise
reversible computation which replaces input $x$ by output $y$, each unit counted
as $kT \ln 2$, represents energy dissipation. Complementary to this idea, if such
a computation uses initially irreversibly provided bits apart from input x, 
then they must be accounted at the same negated cost as that for irreversible 
erasure. Because of the reversibility of the computation, we can argue by 
symmetry. Namely, suppose we run a reversible computation starting when 
memory contains input x and additional record p, and ending with memory 
containing output y and additional garbage bits q. Then p is irreversibly 
provided, and q is irreversibly deleted. But if we run the computation back- 
ward, then the roles of x,p and y, q are simply interchanged. 

Should we charge for the input $x$ or the output $y$? We do not actually
know where the input comes from, nor where the output goes to. Suppose
we cut a computation into two consecutive segments. If the output of
one computation segment is the input of another computation segment, then
the thermodynamic cost of the composition does not contain costs related to
these intermediate data. Thus, we want to measure just the number of
irreversible bit operations of a computation. We can view any computation as
consisting of a sequence of reversible and irreversible operation executions.
We want the irreversibility cost to reflect all nonreversible parts of the
computation. The irreversibility cost of an otherwise reversible computation
must therefore be set to the sum of the number of irreversibly provided and
the number of irreversibly erased bits.

We consider the following axioms as a formal basis on which to develop 
a theory of irreversibility of computation. 

Axiom 1 Reversible computations do not incur any cost. 

Axiom 2 Irreversibly provided and irreversibly deleted bits in a computa- 
tion incur unit cost each. 

Axiom 3 In a reversible computation which replaces input x by output 
y, the input x is not irreversibly provided and the output y is not 
irreversibly deleted. 

Axiom 4 All physical computations are effective. 

Axiom 4 is simply an extended form of Church's Thesis: the notion
of physical computation coincides with effective computation, which
coincides with the formal notion of Turing machine computation. Deutsch,
[Deutsch, 1985], and others have argued the possibility that this is false. If
that turns out to be the case then either our arguments are to be restricted
to those physical processes for which Axiom 4 holds, or, perhaps, one can
extend the notion of effective computations appropriately.



In [Bennett et al., 1993] we and others developed a theory of
information distance with application to the number of irreversible bit
operations in an otherwise reversible computation. A precursor to this line of
thought is [Zurek, 1989]. Among others, they considered the information
distance obtained by minimizing the total amount of information flowing
in and out during a reversible computation in which the program is not
retained.

Since the ultimate limit of energy dissipation by computation is
expressed in the number of bits in the irreversibly erased records, we consider
compactification of records. This is analogous to garbage collection by a
garbage truck: the cost is less if we compact the garbage before we throw it
away.

The ultimate compactification of data which can be effectively exploited
is given by its Kolmogorov complexity. This is a recursively invariant
concept, and expresses the limits to which effective methods can go.
Consequently, the mundane matter of energy dissipation of physical computation
can be linked to, and expressed in, the pristine rigorous notion of
Kolmogorov complexity.
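The effect of compaction on the erasure bill can be illustrated with an ordinary compressor. The following is a minimal sketch, assuming Python's zlib as a crude, computable stand-in for optimal compression (the Kolmogorov complexity itself is not computable). The function name `erasure_cost_bits` is hypothetical and not from this paper, and the sketch only counts the bits that would have to be erased; it does not model the reversible computation that performs the compaction.

```python
import zlib


def erasure_cost_bits(record: bytes, compact: bool) -> int:
    """Number of bits that must be irreversibly erased for this record.

    With compact=True the record is first compressed with zlib, which
    gives only an upper-bound proxy for its Kolmogorov complexity.
    """
    data = zlib.compress(record, 9) if compact else record
    return 8 * len(data)


if __name__ == "__main__":
    regular = b"ab" * 1000  # a highly regular 2000-byte record
    print("raw bits:      ", erasure_cost_bits(regular, compact=False))
    print("compacted bits:", erasure_cost_bits(regular, compact=True))
```

For a regular record the compacted count is far below the raw count; for a record without regularities a compressor cannot help, which matches the role of incompressible strings later in the paper.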



3.1 Kolmogorov Complexity and Irreversibility Cost 

The Kolmogorov complexity, see [Li and Vitanyi, 1993], of x is the length
of the shortest effective description of x. Formally, this can be defined as
follows. Let $x, y, z \in \mathcal{N}$, where $\mathcal{N}$ denotes the natural numbers and we
identify $\mathcal{N}$ and $\{0,1\}^*$ according to the correspondence
$(0, \epsilon), (1, 0), (2, 1), (3, 00), (4, 01), \ldots$. Hence, the length $|x|$ of x is
the number of bits in the binary string x. Let $T_1, T_2, \ldots$ be a standard
enumeration of all Turing machines. Without loss of generality we assume
that all machines in this paper have binary input, storage, and output.
Consider a standard reversible mapping that maps a pair of integers x, y to
another integer $\langle x, y \rangle$. Similarly, $\langle\langle x, y \rangle, z \rangle$ reversibly maps triplets
of integers to a single integer. Let the mapping be Turing-computable.
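The correspondence between natural numbers and binary strings, and one possible pairing, can be sketched as follows. The string encoding follows the correspondence listed above; the choice of the Cantor pairing function for $\langle x, y \rangle$ is an assumption made here for illustration, since the text only asks for some standard reversible, Turing-computable mapping. All function names are hypothetical.

```python
import math


def int_to_string(n: int) -> str:
    """Map 0, 1, 2, 3, 4, ... to '', '0', '1', '00', '01', ..."""
    return bin(n + 1)[3:]  # binary of n + 1 with the leading '1' removed


def string_to_int(s: str) -> int:
    """Inverse of int_to_string."""
    return int("1" + s, 2) - 1


def pair(x: int, y: int) -> int:
    """Cantor pairing (an assumed choice of reversible pairing)."""
    return (x + y) * (x + y + 1) // 2 + y


def unpair(z: int) -> tuple[int, int]:
    """Inverse of pair."""
    w = (math.isqrt(8 * z + 1) - 1) // 2
    y = z - w * (w + 1) // 2
    return w - y, y


if __name__ == "__main__":
    print([int_to_string(n) for n in range(5)])
    print(unpair(pair(3, 4)))
```

Triples are then encoded by nesting, `pair(pair(x, y), z)`, mirroring $\langle\langle x, y \rangle, z \rangle$.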

Definition 2 Let U be an appropriate universal Turing machine such that
$U(\langle\langle i,p \rangle, y \rangle) = T_i(\langle p,y \rangle)$ for all i and $\langle p,y \rangle$. The Kolmogorov complexity
of x given y (for free) is

$$C(x|y) = \min\{|p| : U(\langle p,y \rangle) = x,\ p \in \{0,1\}^*,\ i \in \mathcal{N}\}.$$

Axioms 1 to 4 lead to the definition of the irreversibility cost of a
computation as the number of bits we added plus the number of bits we erased
in computing one string from another. Let $\mathbf{R} = R_1, R_2, \ldots$ be a standard
enumeration of reversible Turing machines, [Bennett, 1975].

The irreversibility cost of otherwise reversibly computing from x to y is
the number of extra bits (apart from x) that must be irreversibly supplied
at the beginning, plus the number of garbage bits (apart from y) that must
be irreversibly erased at the end of the computation to obtain a 'clean'
y. The use of irreversibility resources in a computation is expressed in
terms of this cost, which is one of the information distances considered in
[Bennett et al., 1992]. It is shown to be within a logarithmic additive term
of the sum of the conditional complexities, $C(y|x) + C(x|y)$.



Definition 3 The irreversibility cost $E_R(x,y)$ of computing y from x by a
reversible Turing machine R is

$$E_R(x,y) = \min\{|p| + |q| : R(\langle x,p \rangle) = \langle y,q \rangle\}.$$

We denote the class of all such cost functions by $\mathcal{E}$.

We call an element $E_Q$ of $\mathcal{E}$ a universal irreversibility cost function, if
$Q \in \mathbf{R}$, and for all R in $\mathbf{R}$

$$E_Q(x,y) \le E_R(x,y) + c_R,$$

for all x and y, where $c_R$ is a constant which depends on R but not on x
or y. Standard arguments from the theory of Turing machines show the
following.

Lemma 4 There is a universal irreversibility cost function in $\mathcal{E}$. Denote it
by $E_{UR}$.

Proof. In [Bennett, 1973] a universal reversible Turing machine UR is
constructed which satisfies the optimality requirement. □

Two such universal (or optimal) machines UR and UR' will assign the 
same irreversibility cost to a computation apart from an additive constant 
term c which is independent of x and y (but does depend on UR and UR'). 
We select a reference universal function UR and define the irreversibility 
cost E(x,y) of computing y from x as 

$$E(x,y) = E_{UR}(x,y).$$

In physical terms this cost is in units of kT ln 2, where k is Boltzmann's
constant, T is the absolute temperature in degrees Kelvin, and ln is the
natural logarithm.
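To give a feeling for the physical unit, the conversion from a bit count to dissipated energy can be sketched as below. This is only an arithmetic illustration of the kT ln 2 unit; the function name and the example temperature of 300 K are choices made here, not taken from the paper.

```python
import math

BOLTZMANN_K = 1.380649e-23  # Boltzmann's constant in joules per kelvin


def dissipation_joules(bits: int, temperature_kelvin: float) -> float:
    """Energy corresponding to `bits` irreversible bit operations,
    each counted as k * T * ln 2."""
    return bits * BOLTZMANN_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # One bit at an assumed room temperature of 300 K.
    print(dissipation_joules(1, 300.0))
```

At 300 K one irreversible bit operation corresponds to roughly 2.87e-21 joules, so the cost E(x,y) scales this figure by the number of supplied and erased bits.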

Because the computation is reversible, this definition is symmetric: we 
have E(x,y) = E(y,x). 

In our definitions we have pushed all bits to be irreversibly provided 
to the start of the computation and all bits to be erased to the end of the 
computation. It is easy to see that this is no restriction. If we have a
computation where irreversible acts happen throughout the computation, then we
can always mark the bits to be erased, waiting with actual erasure until the
end of the computation. Similarly, the bits to be provided can be provided
(marked) at the start of the computation while the actual reading of them
(simultaneously unmarking them) takes place throughout the computation.



3.2 Computing Between x and y 

Consider a general computation which outputs string y from input string 
x. We want to know the minimum irreversibility cost for such computation. 



The result below appears in [Bennett et al., 1993] with a different proof. 



Theorem 2 (Fundamental theorem) Up to an additive logarithmic term,

$$E(x,y) = C(x|y) + C(y|x).$$

Proof. We prove first an upper bound and then a lower bound. 

Claim 1 $E(x,y) \le C(y|x) + C(x|y) + 2[C(C(y|x)|y) + C(C(x|y)|x)]$.

Proof. We start out the computation with programs p, q, r. Program p
computes y from x and $|p| = C(y|x)$. Program q computes the value $C(x|y)$
from x and $|q| = C(C(x|y)|x)$. Program r computes the value $C(y|x)$ from
y and $|r| = C(C(y|x)|y)$. To separate the different binary programs we
have to encode delimiters. This takes an additional number of bits
logarithmic in the two smallest lengths among p, q, r. This extra log
term is absorbed in the additive log term in the statement of the theorem.
The computation is as follows. Everything is executed reversibly apart from
the final irreversible erasure.

1. Use p to compute y from x producing garbage bits g(x, y).

2. Copy y, and use one copy of y and g(x, y) to reverse the computation
to x and p. Now we have p, q, r, x, y.

3. Copy x, and use one copy of x and q to compute $C(x|y)$ plus garbage
bits.

4. Use x, y, $C(x|y)$ to dovetail the running of all programs of length
$C(x|y)$ to find s, a shortest program to compute x from y. Doing
this, we produce more garbage bits.

5. Copy s, and reverse the computations in Steps 4, 3 canceling the extra
copies and all garbage bits. Now we have p, q, r, s, x, y.

6. Copy y, and use this copy to compute the value $C(y|x)$ from r and y
producing garbage bits.

7. Use x, y, $C(y|x)$ to dovetail the running of all programs of length
$C(y|x)$ to obtain a copy of p, the shortest program to compute y from
x, producing more garbage bits.

8. Delete a copy of p and reverse the computation of Steps 7, 6 canceling
the superfluous copy of y and all garbage bits. Now we are left with
x, y, r, s, q.

9. Compute from y and s a copy of x and cancel a copy of x. Reverse
the computation. Now we have y, r, s, q.

10. Erase s, r, q irreversibly.

Footnote (to the additive logarithmic term in Theorem 2): This term is
$O(\min\{C(C(y|x)|y), C(C(x|y)|x)\}) = O(\log \max\{C(y|x), C(x|y)\})$. It has
been shown, [Gács, 1974], that for some x of each length n we have

$$\log n - \log\log n \le C(C(x)|x),$$

and for all x of length n we have

$$C(C(x)|x) \le \log n + 2\log\log n.$$

We started out with additional shortest programs p, q, r apart from x. We
have irreversibly erased the shortest programs s, q, r, where $|s| = C(x|y)$,
leaving only y. This proves the claim. □
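Steps 1 and 2 of the computation above use a recurring pattern: compute while saving garbage, copy the result, then run the computation backward to absorb the garbage. A minimal sketch of that pattern follows, assuming a toy computation (repeated halving, which discards one bit per step) in place of a Turing machine. The function names are illustrative and not from the paper.

```python
def forward(x: int, steps: int) -> tuple[int, list[int]]:
    """Halve x `steps` times, saving each discarded bit as garbage."""
    garbage = []
    for _ in range(steps):
        garbage.append(x & 1)  # the bit an irreversible halving would lose
        x >>= 1
    return x, garbage


def backward(y: int, garbage: list[int]) -> tuple[int, list[int]]:
    """Undo `forward`, consuming the garbage bits in reverse order."""
    while garbage:
        y = (y << 1) | garbage.pop()
    return y, garbage


def xor_copy(src: int, dst: int) -> tuple[int, int]:
    """Reversible copy: with dst == 0 this duplicates src, and applying
    it to two equal values cancels the second one back to 0."""
    return src, dst ^ src


def compute_copy_uncompute(x: int, steps: int) -> tuple[int, int]:
    y, garbage = forward(x, steps)         # compute, producing garbage
    y, result = xor_copy(y, 0)             # copy the output to a blank register
    x_again, garbage = backward(y, garbage)  # reverse, absorbing the garbage
    assert not garbage                     # nothing is left to erase
    return x_again, result


if __name__ == "__main__":
    print(compute_copy_uncompute(45, 3))
```

The run ends with the input and a clean copy of the output and no garbage, which is the situation "Now we have p, q, r, x, y" after Step 2.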

Note that all bits supplied in the beginning to the computation, apart
from input x, as well as all bits irreversibly erased at the end of the
computation, are random bits. This is because we supply and delete only shortest
programs, and a shortest program p satisfies $C(p) \ge |p|$, that is, it is
maximally random.

Claim 2 $E(x,y) \ge C(y|x) + C(x|y)$.

Proof. To compute y from x we must be given a program to do so to
start out with. By definition the shortest such program has length $C(y|x)$.
Assume the computation from x to y produces g(x, y) garbage bits. Since
the computation is reversible we can compute x from y and g(x, y).
Consequently, $|g(x,y)| \ge C(x|y)$ by definition [Zurek, 1989]. To end the
computation with y alone we therefore must irreversibly erase g(x, y) which is at
least $C(x|y)$ bits. □

Together Claims 1, 2 prove the theorem. □

Erasing a record x is actually a computation from x to the empty string
$\epsilon$. Hence its irreversibility cost is $E(x, \epsilon)$, and given by a corollary to
Theorem 2.

Corollary 2 Up to a logarithmic additive term, the irreversible cost of
erasure is $E(x,\epsilon) = C(x)$.

4 Trading Time and Space for Energy 

In order to erase a record x, Corollary 2 actually requires us to have, apart
from x, a program p of length $C(C(x)|x)$ for computing $C(x)$, given x. The
precise bounds are $C(x) \le E(x,\epsilon) \le C(x) + 2C(C(x)|x)$. This optimum
is not effective: it requires that p be given in some way. But we can use
the same method as in the proof of Theorem 2, by compressing x using
some time bound t. Using space bounds is entirely analogous. Instead of
the superscript 't' we can use everywhere 's', where 's(·)' denotes a space
bound, or 't, s' to denote simultaneous time and space bounds.

First we need some definitions as in [Li and Vitanyi, 1993], page 378
and further. Because now the time bounds are important we consider the
universal Turing machine U to be the machine with two work tapes which
can simulate t steps of a multitape Turing machine T in $O(t \log t)$ steps. If
some multitape Turing machine T computes x in time t from a program p,
then U computes x in time $O(t \log t)$ from p plus a description of T.

Definition 4 Let $C^t(x|y)$ be the minimal length of a binary program (not
necessarily reversibly) for the two work tape universal Turing machine U
computing x given y (for free) in time t. Formally,

$$C^t(x|y) = \min\{|p| : U(\langle p,y \rangle) = x \text{ in } \le t(|x|) \text{ steps}\}.$$

$C^t(x|y)$ is called the t-time-limited conditional Kolmogorov complexity of
x given y. The unconditional version is defined as $C^t(x) := C^t(x|\epsilon)$. A
program p such that $U(p) = x$ in $\le t(|x|)$ steps and $|p| = C^t(x)$ is denoted
as $x^*_t$.

Note that with $C^t_T(x|y)$ the conditional t-time-limited Kolmogorov
complexity with respect to Turing machine T, for all x, y,
$C^{t'}(x|y) \le C^t_T(x|y) + c_T$, where $t' = O(t \log t)$ and $c_T$ is a constant
depending on T but not on x and y.

This $C^t(\cdot)$ is the standard definition of time-limited Kolmogorov
complexity. However, in the remainder of the paper we always need to use
reversible computations. Fortunately, in [Bennett, 1989] the following is
shown (using the simulations referred to in Section 3).

Lemma 5 For any $\epsilon > 0$, ordinary multitape Turing machines using T time
and S space can be simulated by reversible ones using time $O(T)$ and space
$O(ST^{\epsilon})$ (or in $O(T)$ time and space $O(S + T)$).

To do effective erasure of compacted information, we must at the start of the
computation provide a time bound t. Typically, t is a recursive function and
the complexity of its description is small, say $O(1)$. However, in Theorem 3
we allow for very large running times in order to obtain smaller $C^t(\cdot)$ values.
(In the theorem below t need not necessarily be a recursive function $t(|x|)$,
but can also be used nonuniformly. This leads to a stronger result.)

Theorem 3 (Irreversibility cost of effective erasure) If $t(|x|) \ge |x|$
is a time bound which is provided at the start of the computation, then
erasing an n bit record x by an otherwise reversible computation can be done in
time (number of steps) $O(2^{|x|} t(|x|))$ at irreversibility cost
$C^t(x) + 2C^t(t|x) + 4\log C^t(t|x)$ bits. (Typically we consider t as some
standard explicit time bound and the last two terms adding up to $O(1)$.)

Proof. Initially we have in memory input x and a program p of length
$C^t(t|x)$ to compute reversibly t from x. To separate binary x and binary p
we need to encode a delimiter in at most $2\log C^t(t|x)$ bits.

1. Use x and p to reversibly compute t. Copy t and reverse the
computation. Now we have x, p and t.

2. Use t to reversibly dovetail the running of all programs of length less
than $|x|$ to find the shortest one halting in time t with output x. This
is $x^*_t$. The computation has produced garbage bits $g(x, x^*_t)$. Copy
$x^*_t$, and reverse the computation to obtain x erasing all garbage bits
$g(x, x^*_t)$. Now we have $x, p, x^*_t, t$ in memory.

3. Reversibly compute t from x by p, cancel one copy of t, and reverse
the computation. Now we have $x, p, x^*_t$ in memory.

4. Reversibly cancel x using $x^*_t$ by the standard method, and then erase
$x^*_t$ and p irreversibly.

□ 
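The search in Step 2 of the proof can be illustrated on a toy scale. The sketch below is not the paper's universal machine: it assumes a made-up two-mode decompressor (`run`) with an artificial step count, and it searches exhaustively and irreversibly over programs by increasing length. It is meant only to show how the shortest program found within time t, a stand-in for $x^*_t$, can shrink as t grows. All names and the encoding are hypothetical.

```python
from itertools import product

COUNT_BITS = 4  # assumed width of the repeat counter in the toy encoding


def run(program: str, t: int):
    """Toy decompressor: return the output, or None if the program is
    malformed or needs more than t steps."""
    if not program:
        return None
    mode, rest = program[0], program[1:]
    if mode == "0":
        # Literal mode: output the remaining bits, one step per bit.
        out, steps = rest, len(rest)
    else:
        # Repeat mode: a COUNT_BITS-bit counter followed by a pattern.
        if len(rest) <= COUNT_BITS:
            return None
        count = int(rest[:COUNT_BITS], 2)
        out = rest[COUNT_BITS:] * count
        steps = len(out) ** 2  # assumed, deliberately slow, running time
    return out if steps <= t else None


def shortest_program(x: str, t: int):
    """First program, by increasing length, that outputs x within t steps."""
    for length in range(len(x) + 2):
        for bits in product("01", repeat=length):
            candidate = "".join(bits)
            if run(candidate, t) == x:
                return candidate
    return None


if __name__ == "__main__":
    x = "01" * 6
    print(len(shortest_program(x, 12)), len(shortest_program(x, 144)))
```

With the small time bound only the literal encoding of the 12-bit string is found (13 bits); with the larger bound the 7-bit repeat program becomes available, so fewer bits would have to be erased.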

Corollary 3 The irreversibility cost satisfies 

$$E(x,\epsilon) \ge \lim_{t \to \infty} C^t(x) = C(x),$$

and by Theorem [| up to an additional logarithmic term 

E(x,e) = C(x). 

Essentially, by spending more time we can reduce the thermodynamic
cost of erasure of $x^*_t$ to its absolute minimum. In the limit we spend the
optimal value $C(x)$ by erasing $x^*$, since $\lim_{t \to \infty} x^*_t = x^*$. This suggests the
existence of a trade-off hierarchy between time and energy. The longer one
reversibly computes on a particular given string to perform final irreversible
erasures, the fewer bits are erased and the less energy is dissipated. This
intuitive assertion will be formally stated and rigorously proved below as
Theorem 4: for each length n we will construct a particular string which can be
compressed more and more by a sequence of about $\sqrt{n}/2$ growing time bounds.
We proceed through a sequence of related 'irreversibility' results.

Definition 5 Let UR be the reversible version of the two worktape universal
Turing machine, simulating the latter in linear time by Lemma 5. Let
$E^t(x,y)$ be the minimum irreversibility cost of an otherwise reversible
computation from x to y in time t. Formally,

$$E^t(x,y) = \min\{|p| + |q| : UR(\langle x,p \rangle) = \langle y,q \rangle \text{ in } \le t(|x|) \text{ steps}\}.$$

Because of the similarity with Corollary || (E(x, e) is about C(x)) one is 
erroneously led to believe that $E^t(x,\epsilon) = C^t(x)$ up to a log additive term.
However, the time-bounds introduce many differences. To reversibly
compute $x^*_t$ we may require (because of the halting problem) at least
$O(2^{|x|} t(|x|))$ steps after having decoded t, as indeed is the case in the
proof of Theorem 3. In contrast, $E^t(x,\epsilon)$ is about the number of bits erased
in an otherwise reversible computation which uses at most t steps. Therefore,
as far as we know possibly $C^t(x) > E^{t'}(x,\epsilon)$ implies
$t' = \Omega(2^{|x|} t(|x|))$. More concretely, it is easy to see that for each x and
$t(|x|) \ge |x|$,

$$E^t(x,\epsilon) \ge C^t(x) \ge E^{t'}(x,\epsilon)/2, \qquad (1)$$

with $t'(|x|) = O(t(|x|))$. Namely, the left inequality follows since $E^t(x,\epsilon)$
means that we can reversibly compute from $\langle x,p \rangle$ to $\langle \epsilon,q \rangle$ in $t(|x|)$ time
where $|p| + |q| = E^t(x,\epsilon)$. But this means that we can compute x from q in
$t(|x|)$ time (reversing the computation) and therefore $C^t(x) \le |q|$. The right
inequality follows by the following scenario. At the start of the computation
provide apart from input x also (irreversibly) $x^*_t$, the shortest binary
program computing x in at most $t(|x|)$ steps, so $|x^*_t| = C^t(x)$. From $x^*_t$
reversibly compute a copy of x in $O(t(|x|))$ time, Lemma 5, cancel the input
copy of x, reverse the computation to obtain $x^*_t$ again, and irreversibly erase
$x^*_t$.

Theorem 3 can be restated in terms of $E^t(\cdot)$ as

$$E^{t'}(x,\epsilon) \le C^t(x) + 2C^t(t|x) + 4\log C^t(t|x),$$

with $t'(|x|) = O(2^{|x|} t(|x|))$. Comparing this to the righthand inequality
of Equation 1 we have improved the upper bound on erasure cost at the
expense of increasing erasure time. However, these bounds only suggest but
do not actually prove that we can exchange irreversibility for time. Below,
we establish rigorous time-space-irreversibility trade-offs.



5 Trade-off Hierarchy 

The following result establishes the existence of a trade-off hierarchy of time
versus irreversibility for exponential time computations. The proof
proceeds by a sequence of diagonalizations which just fit in the exponential time
bounds.

Footnote (on related work): A superficially similar but quite different result
for the time-limited so-called uniform Kolmogorov complexity variant
$C(x; |x|)$ was given in [Daley, 1973b], but is too weak for our purpose. There
the time bound t denotes decompression time while in $E^{t'}(x, \epsilon)$ the time
bound t' relates to compression time. Moreover, the result shows a hierarchy
in the sense that for certain classes of unbounded functions
$\{f_i : i \in \mathcal{N}\}$ (satisfying $2f_{i+1}(n) < f_i(n)$), there exists a recursive
infinite sequence $\omega_1\omega_2\ldots$ and a recursive sequence of time bounds
$\{t_i : i \in \mathcal{N}\}$, such that for each $i \ge 1$ there are infinitely many n such
that $C^{t_i}(\omega_1 \ldots \omega_n; n) > f_i(n)$ while for all n we have
$C^{t_{i+1}}(\omega_1 \ldots \omega_n; n) < f_i(n)$. See also Exercise 7.7 in
[Li and Vitanyi, 1993]. Note that the set of infinitely many n in the
statement above may constitute a different disjoint set for each i. Hence, for
each pair of distinct time bounds there are initial segments of the single
infinite sequence which exhibit different compressions, but not necessarily
the same initial segment exhibiting pairwise different compressions for more
than two time bounds simultaneously, let alone a $\sqrt{n}/2$ level time-erasure
hierarchy for single finite sequences of each length n as in Theorem 4. Even
if it could be shown that there are infinitely many initial segments, each of
which exhibits maximally many pairwise different compressions for different
time bounds, it would still only result in a log n level time-decompression
hierarchy for sequences of infinitely many lengths n. In contrast, the proof
of Theorem 4 also yields the analogous $\sqrt{n}/2$ level time-decompression
hierarchy for Kolmogorov complexity.

Theorem 4 (Irreversibility-time trade-off hierarchy) For every large
enough n there is a string x of length n and a sequence of $m = \frac{1}{2}\sqrt{n}$ time
functions $t_1(n) < t_2(n) < \ldots < t_m(n)$, such that

$$E^{t_1}(x, \epsilon) > E^{t_2}(x, \epsilon) > \ldots > E^{t_m}(x, \epsilon).$$

Proof. Given n, we will construct a string x of length n satisfying the
requirements of the theorem. String x will be constructed in m steps, and x
will contain m blocks $x_1, x_2, \ldots, x_m$ each of length $b = n/m$. The idea is to
make these blocks harder and harder to compress. Define, for $1 \le k \le m$,

$$t_k(n) = 2^{kn}.$$

In our construction, we will enforce the following things:

• All m blocks can be compressed iff given enough time. Precisely, $x_k$
can be compressed to $O(\log n)$ size given $t_{k+1}(n)$ time, but given $t_k(n)$
time $x_k$ cannot be compressed at all.

• No "collective compression". If $x_k$ cannot be compressed in time t then
the concatenation $x_k \ldots x_m$, as a single string, cannot be compressed
in time t either. In the construction, we will use only prefixes from
strings in set $S_k$ which consists of strings that are not compressible in
time $t_k(n)$.

Algorithm to Construct x

Initialize: Set $S_0 := \{0,1\}^n$, the set of all strings of length n, and
$t_0(n) := 0$ and $k := 0$.

Repeat For $k + 1 := 1, \ldots, m$: /* Starting the (k+1)st repetition, the
first k blocks $x_1, \ldots, x_k$ of x have already been constructed and in the
kth repetition we have constructed a set $S_k$ consisting of strings of
length $n - kb$, no element of which can be computed from programs
of length less than $n - kb - 2k$ in time $t_k(n)$. Furthermore,

$$2^{n-kb} \ge |S_k| \ge 2^{n-kb-2k}.$$ */

Construct $x_{k+1}$ from $S_k$ as follows. Let s be the lexicographic first
string of length b such that

$$|\{s' : ss' \in S_k\}| \ge 2^{n-(k+1)b-2k}. \qquad (2)$$

Such an s exists by Claim 3. Set $x_{k+1} := s$.

Construct $S_{k+1}$ from $S_k$ and $x_{k+1}$ as follows. Let
$S'_k = \{s' : x_{k+1}s' \in S_k\}$. We have $|S'_k| \ge 2^{n-(k+1)b-2k}$ by Equation 2.
Simulate each of the programs of length less than $n - (k+1)b - 2(k+1)$ for
$t_{k+1}(n)/2$ steps. Set $S_{k+1}$ to be the set of all strings s' of length
$n - (k+1)b$ such that $s' \in S'_k$ and s' is not an output of any of
the above simulations. We have $|S_{k+1}| \ge 2^{n-(k+1)b-2(k+1)}$. Trivially,
$2^{n-(k+1)b} \ge |S_{k+1}|$. This finishes the description of the algorithm.



Claim 3 There is a string s of length b such that

$$|\{s' : ss' \in S_k\}| \ge 2^{n-(k+1)b-2k}.$$

Proof. If the claim is false, then the number of elements in $S_k$ must
be less than

$$2^b \cdot 2^{n-(k+1)b-2k} = 2^{n-kb-2k},$$

which contradicts the lower bound on $|S_k|$. □
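The counting argument of Claim 3 is a pigeonhole principle over prefixes, and the selection of the lexicographically first qualifying prefix can be sketched directly. The code below is an illustration on a small explicit set of strings; the function name and the example set are made up here, and the real sets $S_k$ are of course far too large to enumerate.

```python
from collections import Counter


def first_heavy_prefix(strings, b, threshold):
    """Return the lexicographically first length-b prefix that has at
    least `threshold` extensions in `strings`, or None if there is none."""
    counts = Counter(s[:b] for s in strings)
    for prefix in sorted(counts):
        if counts[prefix] >= threshold:
            return prefix
    return None


if __name__ == "__main__":
    # All 4-bit strings, minus three of the four that start with "00".
    S = {format(i, "04b") for i in range(16)} - {"0001", "0010", "0011"}
    # |S| = 13 >= 2**3, so some 2-bit prefix must have >= 2**(3 - 2) = 2 extensions.
    print(first_heavy_prefix(S, b=2, threshold=2))
```

Here the prefix "00" has only one extension left, so the first prefix that qualifies is "01"; with $2^b$ possible prefixes and a large enough set, at least one prefix always qualifies.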

Claim 4 For each $k = 1, \ldots, m$, the sequence of blocks $x_1, \ldots, x_k$ can be
computed by a $O(\log n)$ sized program in time $t_{k+1}(n)/n$.

Proof. Using the values of n, b, k and a constant size program we
can execute the Construction algorithm up to and including the $(k-1)$th
repetition in at most

$$\sum_{i=1}^{k-1} 2^{n-ib-2i}\, t_i(n) \le 2^{n-b-2} \sum_{i=1}^{k-1} 2^{ni}
\le 2^{n-2\sqrt{n}-2}\, 2^{n(k-1)+1} \le 2^{nk}/2n = t_k(n)/2n$$

steps. Subsequently, we can find $x_k$ in at most $n|S_{k-1}| \le t_k(n)/2n$ steps.
Therefore, in a total number of steps not exceeding $t_k(n)/n$, we can compute
the list $x_1, \ldots, x_k$ by a $O(\log n)$ size program. □
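The chain of inequalities bounding the simulation time can be spot-checked with exact integer arithmetic. The sketch below assumes the reading of the sum as $\sum_{i=1}^{k-1} 2^{n-ib-2i} t_i(n)$ with $t_i(n) = 2^{in}$ and $b = 2\sqrt{n}$, and checks only that this sum is at most $t_k(n)/2n$ for a few perfect-square values of n. It is a numerical sanity check, not a proof, and the function name is made up here.

```python
import math


def simulation_sum_within_bound(n: int, k: int) -> bool:
    """Check sum_{i=1}^{k-1} 2^(n - i*b - 2*i) * t_i(n) <= t_k(n) / (2n),
    with t_i(n) = 2^(i*n) and b = 2*sqrt(n), using exact integers."""
    root = math.isqrt(n)
    if root * root != n:
        raise ValueError("n must be a perfect square in this sketch")
    b = 2 * root
    total = sum(2 ** (n - i * b - 2 * i) * 2 ** (i * n) for i in range(1, k))
    # Multiply through by 2n to stay within integer arithmetic.
    return total * 2 * n <= 2 ** (n * k)


if __name__ == "__main__":
    for n in (16, 64, 256):
        m = math.isqrt(n) // 2
        print(n, all(simulation_sum_within_bound(n, k) for k in range(1, m + 1)))
```

The check succeeds for each tested n and every k up to $m = \sqrt{n}/2$, in line with the slack of $2^{2\sqrt{n}} \ge n$ that the last inequality of the chain relies on.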

Claim 5 Let n, b, m, k be as above. Then, $E^{t_k}(x,\epsilon) \le n - kb + O(\log n)$.

Proof. Using Claim 4, we can compute x from an $O(\log n)$ bits program
and $x_{k+1} \ldots x_m$ ($\le n - kb + O(\log n)$ bits), collectively denoted as program
p, in $t_k(n)/n$ time. Trivially, we can compress x using a program q
(containing n, m, k) with $|q| = O(\log n)$ to p in $t_k(n)/n$ time. Using methods
developed earlier in this paper, we can erase x in an otherwise reversible
computation irreversibly erasing only $|p| = n - kb + O(\log n)$ bits and
irreversibly providing only $|q|$ bits, in $t_k(n)$ time, as follows. By Lemma 5 the
overhead incurred by making these computations reversible is only linear.

1. Reversibly compute p from x and q, with garbage g(x, p), using $O(t_k(n)/n)$
steps. Now we have p, g(x, p).

2. Copy p, then reverse the computation of Item 1, absorbing the garbage
bits g(x, p), using at most $O(t_k(n)/n)$ steps. Now we have x, p, q.

3. Reversibly compute from p to x, with garbage g(p, x); then cancel a
copy of x, using at most $O(t_k(n)/n)$ time. Now we have x, q, g(p, x).

4. Reverse the computation of Item 3, absorbing the garbage bits g(p, x),
leaving only p, q, then remove p and q irreversibly, using at most time
$O(t_k(n)/n)$.

In total, the above erasing procedure uses $O(t_k(n)/n)$ steps and erases $|p| + |q|$
bits irreversibly and provides $|q|$ bits irreversibly. This proves the claim. □

Claim 6 Let n, b, m, k be as above. Then, $E^{t_k}(x, \epsilon) \ge n - kb - 2k - 7\log n$.

Proof. Suppose the contrary, and we can reversibly compute $\langle \epsilon, q \rangle$
from $\langle x, p \rangle$, with

$$|q| \le E^{t_k}(x, \epsilon) < n - kb - 2k - 7\log n.$$

Then, reversing the computation, in $t_k(n)$ time a program q of size at most
$n - kb - 2k - 7\log n$ can reversibly compute x possibly together with (here
irrelevant) garbage p. Therefore, this program q plus descriptions of n, m, k
of total size at most $n - kb - 2k - \log n$ can (possibly non-reversibly) compute
$x_{k+1} \ldots x_m$ in $S_k$ in time $t_k(n)$. But this contradicts the definition that no
string in $S_k$ can be (non-reversibly) computed in time $t_k(n)$ by a program
of less than $n - kb - 2k$ bits. □

By Claim 5 using (k + 1) for k, Claim 6, and the assumption that
$b = 2\sqrt{n}$, we have for all k such that $1 \le k < m$,

$$E^{t_k}(x,\epsilon) > E^{t_{k+1}}(x,\epsilon).$$

The theorem is proven. □ 
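The final comparison combines the lower bound of Claim 6 for k with the upper bound of Claim 5 for k + 1, and the requirement "large enough n" can be made visible numerically. The sketch below assumes logarithms to base 2 and simply drops the unspecified $O(\log n)$ term of Claim 5, so the numbers are indicative only; the function name is made up here.

```python
import math


def hierarchy_gap(n: int, k: int) -> float:
    """Claim 6 lower bound on E^{t_k} minus the Claim 5 upper bound on
    E^{t_{k+1}} (with its O(log n) term dropped), for b = 2*sqrt(n)."""
    b = 2 * math.isqrt(n)
    lower_k = n - k * b - 2 * k - 7 * math.log2(n)
    upper_next = n - (k + 1) * b
    return lower_k - upper_next  # equals b - 2k - 7 log n


if __name__ == "__main__":
    for n in (2 ** 10, 2 ** 20):
        m = math.isqrt(n) // 2
        # The gap is smallest for the largest k, namely k = m - 1.
        print(n, hierarchy_gap(n, m - 1))
```

For n = 2^10 the gap at k = m - 1 is negative, while for n = 2^20 it is comfortably positive, which reflects that $b - 2k - 7\log n \ge \sqrt{n} - 7\log n$ only becomes positive once n is large.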

We have demonstrated our theorem for the case when $y = \epsilon$. For $y \ne \epsilon$,
it is easy to see that the proof still holds if we simply require that $|x_k| > |y|^2$
for each k and make sure y is always an extra input when we simulate all the
short programs to construct x. Therefore, the theorem can be generalized
to the following.

Corollary 4 For every y and every large enough n there is a string x of
length n and a sequence of $m = \frac{1}{2}\sqrt{n}$ time functions
$t_1(n) < t_2(n) < \ldots < t_m(n)$, such that

$$E^{t_1}(x,y) > E^{t_2}(x,y) > \ldots > E^{t_m}(x,y).$$

Various different information distances and thermodynamic cost measures
can be considered. For example, one may consider only the maximum of
the number of irreversibly provided bits (the initial program) and the number
of irreversibly erased bits (the final garbage). Following Landauer,
[Landauer, 1961], we may for the energy-dissipation consider only the number
of irreversibly erased bits. All such measures and also time-limited
Kolmogorov complexities exhibit the same or very similar time-irreversibility
trade-offs by the above proof. The result is common to all reasonable cost
measures, and the reader is referred to [Bennett et al., 1993] for the fine
distinctions among them and for their physical meanings.



6 Extreme Trade-offs 

While the time functions in Theorem 4 are much too large for practical
computations, they are much smaller than the times required to squeeze
the irreversibility out of those computations most resistant to being made
reversible. The following blow-up Lemma 6, [Barzdin', 1968], was one of the
very first results in 'time-limited' Kolmogorov complexity.

Definition 6 Let set $A \subseteq \mathcal{N}$. Its characteristic sequence $\chi = \chi_1\chi_2\ldots$ is
defined by $\chi_i = 1$ if $i \in A$ and 0 otherwise (all $i \in \mathcal{N}$). If A is recursively
enumerable (r.e. for short), then we call $\chi$ an r.e. sequence.



Lemma 6 (i) There is an r.e. sequence $\chi$ such that for each total recursive
function t there is a constant $c_t$ ($0 < c_t < 1$), such that for each n we have
$C^t(\chi_1 \ldots \chi_n | n) \ge c_t n$.

(ii) Each r.e. sequence $\chi$ satisfies $C(\chi_1 \ldots \chi_n) \le 2\log n + c$ for all n,
where c is a constant dependent on $\chi$ (but not on n).

It follows from Equation 1 that $E^t(x,\epsilon) \ge C^t(x)$ for all time bounds t.
Then, by Lemma 6 (i), there is a sequence $\chi = \chi_1\chi_2\ldots$ such that for
each total recursive time bound t there is a constant $c_t > 0$ such that
$E^t(\chi_1 \ldots \chi_n, \epsilon) \ge c_t n$.

However, for a large enough nonrecursive time bound T (like
$T(n) = \infty$) we have $E^T(\chi_1 \ldots \chi_n, \epsilon) = C(\chi_1 \ldots \chi_n)$, for all n. Then,
by Lemma 6 (ii) all such sequences $\chi = \chi_1\chi_2\ldots$ satisfy
$E^T(\chi_1 \ldots \chi_n, \epsilon) \le 2\log n + c$, for all n (with $c > 0$ a constant
depending only on $\chi$). These two facts together demonstrate that, with
respect to the irreversible erasure of certain strings, exponential energy
dissipation savings are sometimes possible when any recursive time bound
whatsoever available for the erasure procedure is changed to a large enough
nonrecursive time bound.
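The step from these two facts to an exponential gap can be written out explicitly. The following is a sketch of the arithmetic only, with x denoting an initial segment of length n of the r.e. sequence of Lemma 6 (i), T the large nonrecursive time bound, and $c'_t$ a name introduced here for illustration.

```latex
\begin{align*}
E^T(x,\epsilon) \le 2\log n + c
  &\;\Longrightarrow\; n \ge 2^{(E^T(x,\epsilon) - c)/2}, \\
E^t(x,\epsilon) \ge c_t\, n
  &\;\Longrightarrow\; E^t(x,\epsilon) \ge c_t\, 2^{-c/2}\, 2^{E^T(x,\epsilon)/2}
   = c'_t\, 2^{E^T(x,\epsilon)/2},
\end{align*}
where $c'_t = c_t\, 2^{-c/2} > 0$ depends only on $t$ and on the sequence.
```

So the erasure cost under any recursive time bound is at least exponential in half the erasure cost under the nonrecursive bound, up to a constant factor.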

Theorem 5 There is an r.e. sequence $\chi$ and some (nonrecursively) large
time bound T, such that for each total recursive time bound t, for each
initial segment x of $\chi$

$$E^t(x,\epsilon) \ge c_t\, 2^{E^T(x,\epsilon)/2},$$

where $c_t > 0$ is a constant depending only on t and $\chi$.

The trade-off can be slightly improved for a restricted set of infinitely
many initial segments of $\chi$ in the sense of dropping the dependency of the
constant $c_t$ on t. Using a result in [Daley, 1973a], page 306 last line, instead of
Barzdin's Lemma 6 (i), changes the theorem to:

"There is an r.e. sequence $\chi$ and some (nonrecursively) large time bound
T, such that for each total recursive time bound t, for infinitely many initial
segments x of $\chi$:

$$E^t(x,\epsilon) \ge c\, 2^{E^T(x,\epsilon)/2},$$

where c is a constant depending only on $\chi$."

In other situations the trade-off can be even more extreme. We just
mention the results and do not explain the esoteric notions involved but
refer the interested reader to the cited literature. For so-called
Mises-Wald-Church random binary sequences $\omega = \omega_1\omega_2\ldots$ where the
admissible place-selection rules are restricted to the total recursive
functions (instead of the more common definition using the partial recursive
functions) Daley has shown the following. (We express his results in the
Kolmogorov complexity variant called uniform complexity he uses. In
[Li and Vitanyi, 1993], Exercise 2.42, the uniform complexity of x is denoted
as $C(x; l(x))$.)

There are sequences $\omega$ as described above such that for each unbounded
total recursive function f (no matter how small) we have
$C(\omega_1 \ldots \omega_n; n) < f(n)$ for all large enough n, [Daley, 1975], given as
Exercise 2.47 Item (c) in [Li and Vitanyi, 1993].

Moreover, for all such $\omega$ and each total unbounded nondecreasing time
bound t (no matter how great) there are infinitely many n such that
$C^t(\omega_1 \ldots \omega_n; n) > n/2$, [Daley, 1973a], given as Exercise 7.6 in
[Li and Vitanyi, 1993].



Defining a uniform energy dissipation variant $E_u(\cdot,\cdot)$ similar to
Definitions 3, 5 but using the uniform Kolmogorov complexity variant, these
results translate in the now familiar way to the statement that the
energy-dissipation can be reduced arbitrarily computably far by using enough
(that is, a noncomputable amount of) time.

Lemma 7 There is a sequence $\omega$ and a (nonrecursively) large time bound
T, such that for each unbounded total recursive function f, no matter how
large, for each total recursive time bound t, there are infinitely many n for
which

$$E_u^t(\omega_1 \ldots \omega_n, \epsilon) > f(E_u^T(\omega_1 \ldots \omega_n, \epsilon)).$$